**SLURM Workload Manager**
----------------------------

**Slurm Commands**

Here is a list of some commonly used user commands. See the Slurm man pages for a complete list of commands, or download the command summary PDF. Note that all Slurm commands start with **s**.

+--------------------------+------------------------------------------------+
| **Command**              | **Description**                                |
+==========================+================================================+
| ``sbatch``               | Submit a job script for later execution.       |
+--------------------------+------------------------------------------------+
| ``scancel``              | Cancel a pending or running job or job step.   |
+--------------------------+------------------------------------------------+
| ``srun``                 | Parallel job launcher (Slurm analog of         |
|                          | ``mpirun``).                                   |
+--------------------------+------------------------------------------------+
| ``squeue``               | Show all jobs in the queue.                    |
+--------------------------+------------------------------------------------+
| ``squeue -u <username>`` | Show jobs in the queue for a specific user.    |
+--------------------------+------------------------------------------------+
| ``squeue --start``       | Report the expected start time for pending     |
|                          | jobs.                                          |
+--------------------------+------------------------------------------------+
| ``squeue -j <jobid>``    | Show the nodes allocated to a running job.     |
+--------------------------+------------------------------------------------+
| ``scontrol show config`` | View default parameter settings.               |
+--------------------------+------------------------------------------------+
| ``sinfo``                | Show cluster status.                           |
+--------------------------+------------------------------------------------+

**Gathering cluster information**

Slurm offers the ``sinfo`` command to get an overview of the resources offered by the cluster. By default, ``sinfo`` lists the partitions that are available. A **partition** is a set of **compute nodes** (computers dedicated to workload computation) grouped logically based on either physical properties of the hardware or job scheduling policies.
Typical examples include partitions dedicated to debugging, where only small and short jobs can be scheduled, or partitions dedicated to visualization, with nodes equipped with specific graphics cards.

.. code-block:: console

    $ sinfo
    PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
    batch     up     infinite     2 alloc node[08-09]
    batch     up     infinite     7 idle  node[10-16]
    debug*    up        30:00     7 idle  node[01-07]

In the above example, we see two partitions, named *batch* and *debug*. The latter is the default partition, as it is marked with an asterisk. All nodes of the debug partition are idle, while two nodes of the batch partition are in use. The nodes in this example are named node01 to node16.

The ``sinfo`` command also lists the time limit (column TIMELIMIT) to which jobs are subject. On every cluster, jobs are limited to a maximum run time, to allow job rotation and give every user a chance to see their job start. Generally, the larger the cluster, the smaller the maximum allowed time.

The ``sinfo`` command can also output the information in a node-oriented fashion, with the ``-N`` argument.

.. code-block:: console

    $ sinfo -N -l
    NODELIST        NODES PARTITION STATE     CPUS S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
    node[01-02]         2 debug*    idle        32 2:8:2   3448    38536     16 Intel    (null)
    node[03,05-07]      4 debug*    idle        32 2:8:2   3384    38536     16 Intel    (null)
    node04              1 debug*    down        32 2:8:2   3394    38536     16 Intel    "Disk replacement"
    node[08-09]         2 batch     allocated   32 2:8:2    246    82306     16 AMD      (null)
    node[10-16]         7 batch     idle        32 2:8:2    246    82306     16 AMD      (null)

With the ``-l`` argument, more information about the nodes is provided, including the number of “CPUs” (CPUS), which is the number of processing units that jobs can use. It should generally equal the number of sockets (S) times the number of cores per socket (C) times the number of hardware threads per core (T in the S:C:T column), but it can be lower if some CPUs are reserved for system use.
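
As a rough illustration of the S:C:T arithmetic described above, the following shell sketch splits the value ``2:8:2`` from the example output and multiplies the three fields. The variable names are illustrative only, and the value is hard-coded rather than read from ``sinfo``.

```shell
#!/bin/sh
# S:C:T value copied from the example sinfo output (hard-coded for illustration)
sct="2:8:2"

# Split the three colon-separated fields with POSIX parameter expansion
sockets=${sct%%:*}   # text before the first colon
rest=${sct#*:}       # text after the first colon
cores=${rest%%:*}    # cores per socket
threads=${rest#*:}   # hardware threads per core

# Expected CPUS value when no CPUs are reserved for system use
cpus=$((sockets * cores * threads))
echo "S:C:T=$sct -> $cpus CPUs"
```

For the example nodes this gives 2 x 8 x 2 = 32, which matches the CPUS column.
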
The other columns report the volatile working memory (RAM, column MEMORY), the size of the local temporary disk (also called *local scratch space*, column TMP_DISK), and the node “weight” (an internal parameter that sets the order of preference among nodes when several could satisfy an allocation). The second-to-last column (AVAIL_FE) shows the so-called **features** of the nodes, which are set by the administrator and can refer to a processor vendor or family, specific network equipment, or any other property that can be used to choose one node type over another. The last column (REASON), if not null, gives the reason why a node is unavailable.

.. note::

    You can specify precisely what information ``sinfo`` should output with its ``--format`` argument. For more details, see the manual page with ``man sinfo``.

**Gathering job information**

The ``squeue`` command lists the jobs that are currently running (in the *RUNNING* **state**, shown as ``R``) or waiting for resources (shown as ``PD``, short for *PENDING*).

.. code-block:: console

    $ squeue
    JOBID PARTITION NAME USER ST  TIME NODES NODELIST(REASON)
    12345     debug job1 dave  R  0:21     4 node[09-12]
    12346     debug job2 dave PD  0:00     8 (Resources)
    12348     debug job3   ed PD  0:00     4 (Priority)

The above output shows that one job is running, whose name is *job1* and whose **jobid** is 12345. The jobid is a unique identifier used by many Slurm commands to act on one particular job. For instance, to cancel job *job1*, you would use ``scancel 12345``. The TIME column shows how long the job has been running so far. The NODES column is the number of nodes allocated to the job, while the NODELIST(REASON) column lists the nodes allocated to running jobs. For pending jobs, that column gives the reason why the job is pending.
In the example, job 12346 is pending because the requested resources (CPUs or others) are not available in sufficient amounts, while job 12348 is waiting for job 12346, whose priority is higher, to run.

SLURM Parameter
=====================

SLURM supports a multitude of parameters. This lets you tailor your script to your needs when using FEDGEN HPC. The following parameters can be used as command-line options with ``sbatch`` and ``srun`` or in a job script; see the job script examples. To use a parameter in a job script, start a new line with the ``#SBATCH`` directive followed by the parameter. Replace ``<...>`` with the value you want, e.g. ``--job-name=test-job``. The following tables show the most commonly used ones.

**Basic Parameters**

+--------------------------------+------------------------------------------------+
| **Parameter**                  | **Function**                                   |
+================================+================================================+
| ``--job-name=<...>`` or        | Job name to be displayed by, for example,      |
| ``-J <...>``                   | the ``squeue`` command                         |
+--------------------------------+------------------------------------------------+
| ``--output=<...>`` or          | Path to the file where the job output is       |
| ``-o <...>``                   | written                                        |
+--------------------------------+------------------------------------------------+
| ``--error=<...>`` or           | Path to the file where the job's error         |
| ``-e <...>``                   | output (stderr) is written                     |
+--------------------------------+------------------------------------------------+
| ``--mail-type=<...>``          | Turn on mail notification; type can be one of  |
|                                | BEGIN, END, FAIL, REQUEUE or ALL               |
+--------------------------------+------------------------------------------------+
| ``--mail-user=<...>``          | Email address to send notifications to         |
+--------------------------------+------------------------------------------------+

**Requesting Resources parameters**

+--------------------------------+------------------------------------------------+
| **Parameter**                  | **Function**                                   |
+================================+================================================+
| ``--time=<...>``               | Time limit for the job. The job is killed by   |
|                                | SLURM once the limit is reached. Format:       |
|                                | days-hours:minutes:seconds                     |
+--------------------------------+------------------------------------------------+
| ``--nodes=<...>`` or           | Number of nodes. Multiple nodes are only       |
| ``-N <...>``                   | useful for distributed-memory jobs (e.g. MPI). |
+--------------------------------+------------------------------------------------+
| ``--mem=<...>``                | Memory (RAM) per node. Number followed by a    |
|                                | unit prefix K, M, G or T, e.g. 16G             |
+--------------------------------+------------------------------------------------+
| ``--mem-per-cpu=<...>``        | Memory (RAM) per requested CPU core. The       |
|                                | default on all partitions is 512M.             |
+--------------------------------+------------------------------------------------+
| ``--ntasks=<...>`` or          | Number of processes. Useful for MPI jobs.      |
| ``-n <...>``                   |                                                |
+--------------------------------+------------------------------------------------+
| ``--ntasks-per-node=<...>``    | Number of processes per node. Useful for MPI   |
|                                | jobs. The maximum depends on the node          |
|                                | (number of cores).                             |
+--------------------------------+------------------------------------------------+
| ``--cpus-per-task=<...>`` or   | CPU cores per task. Use for OpenMP (i.e.       |
| ``-c <...>``                   | shared-memory) or hybrid OpenMP/MPI jobs.      |
|                                | Should equal the number of threads.            |
+--------------------------------+------------------------------------------------+
| ``--exclusive``                | Job will not share nodes with other running    |
|                                | jobs. You will be charged for the complete     |
|                                | nodes even if you asked for less.              |
+--------------------------------+------------------------------------------------+

**Accounting parameters**

+--------------------------------+------------------------------------------------+
| **Parameter**                  | **Function**                                   |
+================================+================================================+
| ``--account=<...>``            | Project (not user) account the job should be   |
|                                | charged to.                                    |
+--------------------------------+------------------------------------------------+
| ``--partition=<...>`` or       | Partition/queue in which to run the job.       |
| ``-p <...>``                   |                                                |
+--------------------------------+------------------------------------------------+
| ``--qos=<...>``                | The quality of service requested; can be       |
|                                | *low*, *normal* or *high*                      |
+--------------------------------+------------------------------------------------+

**Advanced Job Control parameters**

+--------------------------------+------------------------------------------------+
| **Parameter**                  | **Function**                                   |
+================================+================================================+
| ``--array=<...>``              | Submit a collection of similar jobs, e.g.      |
|                                | ``--array=1-10`` (``sbatch`` command only).    |
|                                | See the official SLURM documentation.          |
+--------------------------------+------------------------------------------------+
| ``--dependency=<...>``         | Delay the start of the job until the           |
|                                | specified dependencies have been satisfied,    |
|                                | e.g. ``--dependency=afterok:123456``.          |
+--------------------------------+------------------------------------------------+
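
To illustrate how the parameters above combine, here is a minimal sketch of a job script. The job name, file names, resource values and the ``debug`` partition are placeholder assumptions chosen for illustration, not FEDGEN HPC defaults, and the final ``echo`` stands in for a real workload.

```shell
#!/bin/bash
# Basic parameters (name and file paths are placeholders)
#SBATCH --job-name=test-job
# %j in a file name is replaced by the job ID
#SBATCH --output=test-job-%j.out
#SBATCH --error=test-job-%j.err
# Resource requests (illustrative values): 10 minutes, 1 node, 1 task, 4 cores
#SBATCH --time=0-00:10:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=512M
# Partition name borrowed from the sinfo example; adjust for your cluster
#SBATCH --partition=debug

# Slurm sets SLURM_CPUS_PER_TASK when --cpus-per-task is given;
# fall back to 1 so the script also runs outside a Slurm job.
threads="${SLURM_CPUS_PER_TASK:-1}"
export OMP_NUM_THREADS="$threads"

# Placeholder for the real workload
echo "Job ${SLURM_JOB_ID:-none} running with ${OMP_NUM_THREADS} thread(s)"
```

Saved under a hypothetical name such as ``test-job.sh``, the script would be submitted with ``sbatch test-job.sh``. The ``#SBATCH`` lines are ordinary comments to the shell, so they must appear before the first executable command for Slurm to read them.
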